Azure AI Engineer Associate
Complete Study Course
A comprehensive, exam-aligned interactive course covering all 6 domains of the AI-102 exam. Work through every topic, then test yourself with 50 realistic practice questions.
📋Exam Domain Weights (2025–2026)
🎯Exam Quick Facts
| Passing Score | 700 / 1000 (70%) |
| Duration | 100 minutes |
| Questions | ~40–60 items (MCQ, case studies, labs) |
| Code Language | Python or C# — choose at start, cannot change |
| Validity | 12 months — free annual renewal via MS Learn |
| Retake Policy | Wait 24 hours after first fail; varies for subsequent |
| Price | ~$165 USD (varies by region) |
| Exam Focus (2025) | Product knowledge, service limits, GenAI, Foundry, Responsible AI |
📚Recommended Study Path
- Complete all 6 domain modules in order (click each in the sidebar)
- Use the code examples to practice with Azure SDKs (Python preferred)
- Do hands-on labs in Azure portal — create real resources
- Take the 50-question mock exam — aim for ≥85% before booking
- Review wrong answers — each has a full explanation
- Revisit the official Microsoft Learn paths for weak areas
- Book the exam via Pearson VUE at least 1 week out
1.1 — Microsoft Foundry Services & Selection
🏗Azure AI Foundry Overview
Azure AI Foundry (formerly Azure AI Studio) is Microsoft's unified platform for building, deploying, and managing enterprise AI solutions. It organizes work into Hubs and Projects.
| Concept | Role | Scope |
|---|---|---|
| Foundry Hub | Enterprise-level container; manages shared resources, security, network config | Organization-wide |
| Foundry Project | Team/workload container; inherits Hub security; where work actually happens | Team / solution |
| Foundry Agent Service | Managed service for creating and deploying AI agents | Project-level |
🔧Selecting the Right Azure AI Service
You must match a business scenario to the correct Azure AI service family. This is heavily tested.
| Requirement | Service to Select | Notes |
|---|---|---|
| Generate text, code, images from LLM | Azure OpenAI in Foundry | GPT-4o, DALL-E, Whisper |
| Analyze images, detect objects | Azure Vision in Foundry Tools | Formerly Computer Vision / Custom Vision |
| Speech-to-text, text-to-speech | Azure Speech in Foundry Tools | Formerly Azure Cognitive Speech |
| Translate text/documents | Azure Translator in Foundry Tools | Custom translator available |
| Extract key phrases, entities, sentiment | Azure Language in Foundry Tools | Formerly Text Analytics |
| Extract data from forms, invoices, IDs | Azure Document Intelligence | Formerly Form Recognizer |
| Full-text + vector search over docs | Azure AI Search | Formerly Cognitive Search |
| Index and search video content | Azure AI Video Indexer | Extract insights from video/audio |
| Moderate content in images/text | Azure AI Content Safety | Part of Responsible AI tooling |
| Understand docs, images, video, audio | Azure Content Understanding | New multimodal service in Foundry Tools |
⚙Creating & Deploying AI Services
Azure AI resources can be deployed as multi-service resources or single-service resources.
- Multi-service resource: Single endpoint and key for multiple AI services — simpler, fewer resources to manage.
- Single-service resource: Dedicated resource per service — better isolation, service-specific tiers and limits.
- Resources require a resource group, region, pricing tier, and name.
- Some services (e.g., OpenAI) require separate approved subscriptions.
🔑Authentication Methods
| Method | When to Use | Security Level |
|---|---|---|
| API Key (Ocp-Apim-Subscription-Key) | Simple apps, quick testing | Medium |
| Microsoft Entra ID (AAD) | Enterprise, production, RBAC required | High |
| Managed Identity | Azure-hosted apps, no credentials to manage | High |
| SAS Token | Storage access, time-limited access | Medium-High |
1.2 — Deploy, Monitor & Secure Services
📊Monitoring Azure AI Resources
Azure AI services integrate with Azure Monitor for metrics, logs, and alerts.
- Metrics: Request count, latency, error rate, throttled calls — available via Azure Monitor.
- Diagnostic Logs: Must be explicitly enabled. Send to Log Analytics, Storage Account, or Event Hub.
- Log Analytics Workspace: Use KQL (Kusto Query Language) to query logs.
- Alerts: Configure metric alerts for anomaly detection (e.g., error rate > 5%).
- Azure AI Foundry: Model monitoring for generative AI — monitors performance, safety, resource consumption.
🔒Network Security for Foundry Services
- Public endpoint: Default — accessible from internet (use firewall rules + key auth).
- Virtual Network (VNet) Service Endpoint: Restricts access to specific VNet subnets.
- Private Endpoint: Private IP in your VNet — no traffic over internet; best for enterprise.
- IP Firewall Rules: Allowlist specific IP ranges for public endpoint.
- Foundry Hub networking: Configure at Hub level — Projects inherit the network configuration.
💰Managing Costs
- Pricing tiers: Free (F0), Standard (S0), and premium tiers — F0 has usage limits (typically 5k calls/month).
- Token-based billing: Azure OpenAI bills per 1K input and output tokens separately.
- PTU (Provisioned Throughput Units): Reserved capacity for Azure OpenAI — predictable latency and cost.
- Azure Cost Management: Use budgets and alerts to prevent overspend.
- Content caching: Prompt caching can reduce OpenAI costs for repeated prompts.
🔄CI/CD Integration
Integrate Azure AI resources into automated pipelines using:
- Azure DevOps / GitHub Actions: Deploy AI resources via ARM templates, Bicep, or Terraform.
- Azure AI Foundry SDK: Use
azure-ai-projectsPython SDK to automate model deployment in pipelines. - Model versioning: Track model versions; use deployment slots for staged rollouts.
- Prompt Flow: CI/CD for LLM applications — version, test, and deploy prompt flows as endpoints.
1.3 — Responsible AI
⚖Microsoft's 6 Responsible AI Principles
🛡Azure AI Content Safety Features
| Feature | What It Does | Use Case |
|---|---|---|
| Content Filters | Block harmful categories (hate, sexual, violence, self-harm) in LLM I/O | All GenAI deployments |
| Prompt Shields | Detect jailbreak attempts and prompt injection attacks | Protect system prompts |
| Groundedness Detection | Detect hallucinations — ungrounded claims in RAG responses | RAG/QA systems |
| Blocklists | Custom word/phrase lists to block specific content | Brand protection, compliance |
| Harm Detection | Detect self-harm, extremist content, material endangering safety | Consumer apps |
| Content Moderation API | Moderate text and images for safe categories | UGC platforms |
📋Responsible AI Governance Framework
Designing a governance framework involves:
- Impact Assessment: Identify potential harms before deployment
- Model Cards: Document model capabilities, limitations, and intended uses
- Human-in-the-loop: Define when human review is required
- Monitoring: Continuously monitor for fairness drift and harmful outputs
- Incident Response: Define escalation path for AI failures
- Data Governance: Document training data sources and consent
2.1 — Building GenAI Solutions with Azure Foundry
🏗Azure AI Foundry Setup
To build a GenAI solution in Foundry:
- Create a Foundry Hub (top-level, manages shared infra + security)
- Create a Foundry Project under the Hub (workspace for the solution)
- Deploy a model within the project (GPT-4o, Llama 3, etc.)
- Build with Prompt Flow or SDK
- Evaluate with built-in evaluators
- Deploy as a managed online endpoint
🔄RAG — Retrieval Augmented Generation
RAG grounds model responses in your private data. Key components:
| Component | Azure Service | Role |
|---|---|---|
| Embedding Model | text-embedding-ada-002 | Convert text to vectors |
| Vector Store | Azure AI Search | Store and search vectors |
| Retriever | AI Search (hybrid) | Find relevant chunks |
| Generator | Azure OpenAI (GPT-4o) | Generate grounded answer |
| Orchestration | Prompt Flow / LangChain | Connect pipeline |
📊Prompt Flow
Prompt Flow is an LLM-ops tool for building, testing, and deploying LLM applications.
- Flow types: Standard (general), Chat (multi-turn), Evaluation (scoring)
- Nodes: LLM, Python, Prompt, Tool — connected as a DAG
- Variants: Test different prompt versions in parallel
- Evaluators: Built-in metrics — groundedness, relevance, coherence, fluency, similarity
- Deployment: Deploy flows as REST endpoints with auto-scaling
📐Prompt Engineering Techniques
| Technique | Description | Best For |
|---|---|---|
| Zero-shot | No examples; rely on model knowledge | Simple, well-defined tasks |
| Few-shot | Provide 2–5 examples in the prompt | Pattern-following tasks |
| Chain-of-Thought | "Think step by step" — intermediate reasoning | Math, logic, multi-step problems |
| System Message | Define persona, constraints, output format | All chat applications |
| Meta Prompting | Ask model to generate its own prompt structure | Complex workflows |
| Retrieval Augmentation | Inject retrieved context into prompt | QA over private data |
2.2 — Azure OpenAI Service & Models
🤖Azure OpenAI Models
| Model | Capability | Key Use |
|---|---|---|
| GPT-4o | Multimodal: text, image, audio input | Complex reasoning, vision tasks |
| GPT-4o mini | Fast, cost-efficient version of GPT-4o | High-volume, lower-cost use cases |
| o1 / o3 | Reasoning models — extended thinking | Math, science, code reasoning |
| text-embedding-3 | Generate text embeddings for semantic search | RAG, semantic similarity |
| DALL-E 3 | Generate images from text prompts | Creative content, image generation |
| Whisper | Speech-to-text transcription | Audio transcription, subtitles |
| TTS | Text-to-speech (via OpenAI endpoint) | Spoken responses |
⚙Key Parameters for Generation Control
| Parameter | Range | Effect |
|---|---|---|
| temperature | 0.0 – 2.0 | Randomness — 0=deterministic, 1=creative, 2=chaotic |
| top_p | 0.0 – 1.0 | Nucleus sampling — prefer over temperature; 0.9 = top 90% likely tokens |
| max_tokens | 1 – model max | Maximum output tokens (affects cost) |
| frequency_penalty | -2.0 – 2.0 | Reduce repetition of frequent tokens |
| presence_penalty | -2.0 – 2.0 | Encourage new topics in output |
| stop | string[] | Stop generation when sequence is found |
| seed | integer | Reproducible outputs (same seed = same result) |
🖼DALL-E & Multimodal
DALL-E 3 generates images from text via the Azure OpenAI endpoint:
For GPT-4o vision, pass image as base64 or URL in the user message content array with type: "image_url".
🎛Fine-tuning
Fine-tuning adapts a base model on your domain-specific data:
- Supported models: GPT-4o mini, GPT-3.5 Turbo (check current availability)
- Training format: JSONL with
{"messages": [...]}pairs - Minimum dataset: 10 examples (recommend 50–100+ for good results)
- When to fine-tune: Consistent style/format, domain terminology, not for adding new knowledge (use RAG instead)
- Cost: Training cost + higher inference cost vs base model
2.3 — Optimize & Operationalize GenAI
📈Model Evaluation Metrics
| Metric | What It Measures | Scale |
|---|---|---|
| Groundedness | Are claims in the response supported by the retrieved context? | 1–5 |
| Relevance | How relevant is the response to the question? | 1–5 |
| Coherence | Is the response logically consistent and well-structured? | 1–5 |
| Fluency | Is the response grammatically correct and readable? | 1–5 |
| F1 Score | Token overlap between generated and ground truth answers | 0–1 |
| Similarity | Semantic similarity to ground truth | 0–1 |
🔍Tracing & Feedback
- Azure AI Foundry tracing: Track each step in a prompt flow — inputs, outputs, latency per node.
- OpenTelemetry integration: Export traces to Azure Monitor / Application Insights.
- Feedback collection: Collect thumbs up/down ratings; feed back into evaluation datasets.
- Model reflection: Use a secondary LLM call to self-critique the primary response.
🌊Deployment & Scalability
- Standard deployment: Pay-per-token, auto-scales to Azure capacity.
- PTU deployment: Provisioned Throughput Units — reserved compute, consistent latency, hourly billing.
- Container deployment: Deploy Foundry models to ACI, AKS, or edge (IoT) using containers.
- Multi-model orchestration: Route different request types to specialized models (e.g., GPT-4o for complex, GPT-4o mini for simple).
3.1 — Building Custom Agents
🤖What is an AI Agent?
An AI agent is an autonomous system that perceives its environment, makes decisions, and takes actions to achieve goals — using LLMs as the reasoning engine.
- Core loop: Perceive → Reason → Act (ReAct pattern)
- Tools: Agents call external tools/APIs — code interpreter, search, databases, custom APIs
- Memory: Short-term (conversation context), long-term (vector stores), episodic
- Planning: Break complex tasks into sub-tasks; execute step-by-step
- Reflection: Self-evaluate outputs and retry if needed
🏗Microsoft Foundry Agent Service
The Azure AI Foundry Agent Service provides a managed platform for building and deploying agents:
- Create agents via Foundry portal or SDK with instructions, model, tools, and thread management.
- Built-in tools: Code Interpreter, File Search (RAG), Function Calling, Azure AI Search, Bing Search.
- Threads: Manage conversation state; messages persist in thread context.
- Runs: Execute agent on a thread; poll for completion or use streaming.
🔗Multi-Agent Orchestration
- Microsoft Agent Framework (Semantic Kernel / AutoGen): Build complex workflows with multiple specialized agents.
- Orchestrator agent: Routes tasks to specialized sub-agents.
- Sub-agents: Each handles a specific domain (e.g., SQL agent, document agent, search agent).
- Autonomous capabilities: Agents can trigger other agents; use event-driven patterns.
- Testing: Test agents with diverse scenarios; evaluate tool use accuracy and goal completion.
4.1 — Image Analysis & OCR
🖼Azure Vision in Foundry Tools — Capabilities
| Feature | What It Returns | API Parameter |
|---|---|---|
| Tags | Descriptive labels (e.g., "dog", "outdoor") with confidence scores | visualFeatures=Tags |
| Objects | Detected objects with bounding boxes | visualFeatures=Objects |
| Description | Natural language caption of the image | visualFeatures=Description |
| Faces | Face detection with bounding boxes (no identification) | visualFeatures=Faces |
| Color | Dominant/accent colors, black&white detection | visualFeatures=Color |
| Image Type | Clip art, line drawing detection | visualFeatures=ImageType |
| Adult | Adult/racy content scores (0–1) | visualFeatures=Adult |
| Smart Crops | Suggested crop regions | smartCrops parameter |
📄OCR & Read API
The Read API (Azure Vision 4.0 Read) extracts text from images and documents:
- Input formats: JPEG, PNG, BMP, TIFF, PDF — max 50 MB, max 10,000 pages
- Languages: 164+ languages supported
- Handwriting: Detect and extract handwritten text
- Response: Returns text lines, words, bounding polygons, confidence scores
- Async operation: For large documents — submit and poll with operation ID
4.2 — Custom Vision Models & Video Analysis
🎯Custom Vision — Classification vs Detection
| Type | Output | Min Training Images | Use Case |
|---|---|---|---|
| Image Classification | Class label + confidence for whole image | 5 per class | "Is this a cat or dog?" |
| Object Detection | Bounding boxes + labels for multiple objects | 15 per class | "Find all cars in image" |
Training workflow: Upload images → Label → Train → Evaluate (Precision/Recall/AP) → Publish → Consume via prediction URL
🎬Video Analysis Services
| Service | Capabilities | Use Case |
|---|---|---|
| Azure AI Video Indexer | Transcription, speaker diarization, face detection, OCR in video, key frame extraction, content moderation, topic detection | Media archives, searchable video, compliance |
| Spatial Analysis (Vision) | Count people in zones, detect line crossing, track movement in video streams | Retail occupancy, safety compliance, queue management |
Video Indexer insights include: transcript, OCR, keywords, labels, scenes, shots, keyframes, faces, named people, brands, sentiments.
5.1 — Text Analytics & Translation
📝Azure AI Language — Text Analytics Features
| Feature | Output | Notes |
|---|---|---|
| Sentiment Analysis | Positive/Negative/Neutral/Mixed + confidence per sentence | Opinion mining shows aspect-level sentiment |
| Key Phrase Extraction | Main concepts/phrases from text | Max 5,120 chars per document |
| Entity Recognition (NER) | Named entities: Person, Org, Location, DateTime, etc. | 18 entity categories |
| Entity Linking | Disambiguate entities with Wikipedia links | Useful for knowledge graphs |
| PII Detection | Personal data: SSN, email, phone, credit card, etc. | Returns categories and redacted text |
| Language Detection | ISO 639-1 language code + confidence | Returns "unknown" if confidence low |
| Text Summarization | Extractive or abstractive summary | Extractive: picks existing sentences |
🌍Azure Translator in Foundry Tools
- Text translation: Translate text between 100+ languages in a single API call
- Document translation: Translate entire documents while preserving layout — async batch operation
- Transliteration: Convert script without translating (e.g., Arabic → Latin letters)
- Language detection: Auto-detect source language
- Dictionary lookup: Find alternative translations and word usage examples
- Custom Translator: Fine-tune translation for domain-specific terminology (e.g., legal, medical)
5.2 — Speech Services
🎙Azure Speech in Foundry Tools
| Capability | Description | Key Config |
|---|---|---|
| Speech-to-Text (STT) | Real-time or batch transcription from audio | Language, audio format, diarization |
| Text-to-Speech (TTS) | Convert text to natural speech | Voice name, language, SSML |
| Speech Translation | Real-time speech translation to multiple targets | Source/target languages |
| Custom Speech | Adapt STT for domain vocabulary/accents | Training data, acoustic model |
| Custom Neural Voice | Create brand-specific TTS voice | Requires speaker consent + recording |
| Intent Recognition | Integrate with LUIS/CLU for intent extraction from speech | LUIS app ID required |
| Keyword Recognition | Always-on wake-word detection | Keyword model file (.table) |
🎨SSML — Speech Synthesis Markup Language
SSML allows fine-grained control over TTS output:
SSML elements: <voice> (select voice), <prosody> (rate/pitch/volume), <break> (pause), <emphasis> (stress), <say-as> (dates, ordinals), <phoneme> (pronunciation).
🔊Audio Formats for Speech
| Direction | Supported Formats | Recommended |
|---|---|---|
| Input (STT) | WAV (PCM 16-bit), MP3, OGG, FLAC, AMR, WebM | WAV PCM 16kHz 16-bit mono |
| Output (TTS) | WAV, MP3, OGG, RAW, RIFF | MP3 for streaming, WAV for quality |
5.3 — Custom Language Models & QnA
🧠Conversational Language Understanding (CLU)
CLU (successor to LUIS) builds custom intent/entity recognition models:
| Component | Description | Example |
|---|---|---|
| Intent | The action the user wants to perform | "BookFlight", "CancelOrder" |
| Entity | Key information extracted from utterance | City, Date, ProductName |
| Utterance | Sample user input used for training | "Book a flight to Paris tomorrow" |
| None intent | Catch-all for out-of-scope inputs | — |
Workflow: Create project → Add intents/entities → Add utterances → Train → Evaluate (F1/Precision/Recall) → Deploy → Consume
❓Custom Question Answering (QnA)
Build FAQ-style knowledge bases that answer questions from documents:
- Sources: URLs (FAQ pages), files (PDF, DOCX, TXT, TSV), manual Q&A pairs
- Multi-turn conversations: Add follow-up prompts to create guided conversation flows
- Chit-chat: Add personality responses (Professional, Friendly, Witty, etc.)
- Alternate phrasings: Add variant ways to ask the same question
- Active learning: System suggests improvements based on low-confidence answers
- Multi-language: Set project language or use per-document language detection
- Export: Export as TSV or JSON for backup/migration
🌐Custom Translator
- Train custom translation models for domain-specific terminology
- Training data: Parallel sentence pairs (source + target), bilingual documents
- Min training data: 1,000 sentence pairs (recommend 10,000+)
- Publish to custom category; call via
categoryparameter in Translator API - Iterative improvement: Add more data, retrain, compare BLEU scores
6.1 — Azure AI Search
🔍Core Architecture
Azure AI Search (formerly Cognitive Search) has 4 main components:
| Component | Role | Key Concepts |
|---|---|---|
| Data Source | Connection to raw data (Azure Blob, SQL, Cosmos DB, ADLS) | Connection string, container/table name |
| Index | Schema defining searchable fields and their properties | Searchable, Filterable, Sortable, Facetable, Retrievable |
| Skillset | AI enrichment pipeline applied during indexing | Built-in + custom skills, knowledge store |
| Indexer | Orchestrates the pipeline — reads source, applies skills, writes to index | Schedule, field mappings, output field mappings |
⚡Built-in Cognitive Skills
| Skill | Input | Output |
|---|---|---|
| OCR Skill | Image | text, layoutText |
| Image Analysis Skill | Image | tags, description, objects |
| Entity Recognition Skill | text | persons, organizations, locations, etc. |
| Key Phrase Skill | text | keyPhrases |
| Sentiment Skill | text | score, label |
| Language Detection Skill | text | languageCode, languageName |
| Split Skill | text | textItems (chunks) |
| Merge Skill | multiple strings | mergedText |
| Shaper Skill | multiple fields | complex shaped object |
| Azure OpenAI Embedding Skill | text | vector embedding |
📦Custom Skills
Extend skillsets with custom logic:
- Custom Web API skill: Call any HTTPS endpoint that accepts the skill contract (JSON in/out)
- Azure Function skill: Easiest pattern — deploy Azure Function, register as custom skill
- Azure ML skill: Call Azure ML endpoint for custom model inference
- Input/Output schema: Must match skillset contract — array of values with document key
🔎Query Types & Syntax
| Query Type | Syntax | Use Case |
|---|---|---|
| Simple | Default — keyword matching with +/- operators | Basic search |
| Full Lucene | queryType=full — wildcards, regex, fuzzy (~), proximity, boosting | Advanced text search |
| Semantic | queryType=semantic — AI re-ranking + captions/answers | Natural language QA |
| Vector | vector field + nearest neighbor — cosine similarity | Semantic similarity search |
| Hybrid | Full-text + vector in single query — RRF fusion scoring | Best of both — RAG pipelines |
6.2 — Azure Document Intelligence
📋Prebuilt Models
| Model | Extracts | Key Fields |
|---|---|---|
| Invoice | Vendor, customer, line items, totals | InvoiceId, VendorName, TotalTax, DueDate |
| Receipt | Merchant, transaction, items, totals | MerchantName, TransactionDate, Total |
| ID Document | Passport, driver's license fields | FirstName, LastName, DOB, DocumentNumber |
| Business Card | Contact information | ContactNames, Emails, PhoneNumbers |
| W2 | US tax form fields | Employee, Employer, Wages, FederalTax |
| Read | Text extraction only (no structure) | content, pages, lines, words |
| Layout | Text + structure (tables, selections) | paragraphs, tables, selectionMarks |
| General Document | Key-value pairs + layout | keyValuePairs, entities |
🎨Custom Models
| Type | When to Use | Min Training Docs |
|---|---|---|
| Custom Template | Fixed-layout documents (forms with consistent structure) | 5 labeled samples |
| Custom Neural | Variable-layout documents (diverse formats) | 10 labeled samples |
| Composed Model | Route between multiple custom models based on document type | N/A (combines existing models) |
Labeling workflow: Upload samples to Azure Blob → Label in Document Intelligence Studio → Train → Evaluate (F1/accuracy per field) → Publish → Consume
docType field indicating which sub-model was selected.
📁Supported Formats & Limits
| Property | Value |
|---|---|
| File formats | PDF, JPEG, PNG, BMP, TIFF, HEIF (modern models) |
| Max file size | 500 MB per file |
| Max pages | 2,000 pages per request (standard) |
| Min image dimensions | 50 × 50 pixels |
| Max image dimensions | 10,000 × 10,000 pixels |
6.3 — Azure Content Understanding
🌐Azure Content Understanding Overview
Azure Content Understanding (in Foundry Tools) is a new multimodal extraction service that goes beyond Document Intelligence — it processes documents, images, videos, and audio to extract structured insights.
| Capability | Description |
|---|---|
| OCR Pipeline | Extract text from images and documents — high accuracy with layout preservation |
| Document Understanding | Summarize, classify, detect attributes, extract entities/tables/images from documents |
| Video Processing | Frame analysis, scene detection, transcription, visual grounding in video |
| Audio Processing | Transcription, speaker diarization, sentiment from audio |
| Field Extraction | Schema-based extraction — define fields you want and it extracts them from any content type |